ABSTRACT

The evidence used in condemning a test as racially 
biased is usually a validity coefficient for one racial group that is 
significantly different from that of another racial group. However, 
both variables in the calculation of a validity coefficient should be 
examined to determine where the bias lies. A study was conducted to 
investigate the construct validity of a set of predictors and a set 
of criterion scales separately for blacks and whites. Data were
collected during a project to validate a test battery and a follow-up
study of the effectiveness of the resultant selection procedures. Subjects
totaled 70 blacks and 104 whites. Tests used were the Adaptability
Test; the Spelling scale, Word Meaning scale, Checking scale, and
Copying scale of the Purdue Clerical Adaptability Test; and the ten
scales of the Guilford-Zimmerman Temperament Survey. The criterion
scales consisted of 5 specially constructed, behaviorally anchored
rating scales measuring accuracy, information, attitude, initiative,
and knowledge of procedures. Eleven cases of differential validity
occurred. The pattern of test-criterion relationships was obviously
different for the two groups. The factor patterns for the 15
predictors were for the most part similar for blacks and whites. The
analysis of the factor structure of the predictors suggested that
they measure the same or similar constructs within the two groups.
Results indicate that the criterion scales, more than the predictors,
contributed to the differential validity and single-group validity.
The phrase "unfair test discrimination" suggests that this much
discussed problem can be attributed to the tests used to predict later
job performance. The evidence used to demonstrate the phenomenon
consists of showing that the validity coefficient for one racial
group is significantly different from that of another racial group.
The implication is that the tests are measuring different constructs
within different racial subgroups. A common extension of this line
of reasoning is to argue against tests and their use, except in
those cases where the validity of the test is the same for all racial
groups involved.

There is, however, another factor which must be investigated 
before tests are branded as the culprits. It takes two variables to 
permit the calculation of a validity coefficient, and both of the 
variables must be investigated in terms of their contribution to 
differential validity. If psychological tests are capable of tapping 
different constructs in different racial groups, then so might the
common types of subjective criterion rating scales. 

Boehm, in her excellent review of studies showing differential
validity and single group validity, suggests that the criterion may
be responsible when job performance is measured by supervisory ratings.
This is supported by Campbell, Pike, and Flaugher (1969) and by Bass
and Turner (1973), both finding differences in criterion ratings
which were related to ethnic group membership.

This study represents an attempt to investigate the construct
validity of both a set of predictors and a set of criterion scales,
separately for Blacks and Whites. The data were collected during a
project to validate a test battery and a follow-up study of the
effectiveness of the resultant selection procedures. This secondary
analysis was performed to help understand obtained examples of
differential and single group validity.

The original validation study was conducted with a sample of
88 clerical employees in an insurance company: 30 Blacks and 58 Whites.
One year after the new selection procedures were instituted, criterion
data were collected for all clerical employees hired during the first
nine months of that period. This group consisted of 40 Blacks and
46 Whites. Since this analysis focused on the meaning of the measures
rather than on prediction, the two groups of subjects were combined
for a grand total of 70 Blacks and 104 Whites.

The validation study was a modified concurrent design. The
sample consisted of employees who had been tested before being hired;
however, the tests were not used in any systematic way in the hiring
decision. Tests used were the Adaptability Test; the Spelling scale,
Word Meaning scale, Checking scale, and Copying scale of the Purdue
Clerical Adaptability Test; and the ten scales of the
Guilford-Zimmerman Temperament Survey. The criterion consisted of five
behaviorally anchored rating scales constructed specifically for this
study. The scales measured dimensions defined as Accuracy, Information,
Attitude, Initiative, and Knowledge of Procedures.

There were no statistically significant differences between the
two racial groups on any of the 15 predictor scales or on any of the
five criterion scales in the initial study. However, the pattern
of intercorrelations did reveal many between-group differences.
While there were some significant predictor-criterion correlations
within each group, there was only one predictor-criterion pair
which was significant for both groups. Step-wise multiple
regression was used to develop prediction equations for each group.
Significant multiple R's were obtained for each criterion scale for
each sub-group, although the equations were markedly different. It
was assumed at the time that the differences in the patterns of
intercorrelations were due to the differential meanings of the tests
(test bias, if you must) for the two racial groups.

No cross-validation was performed on the original data; rather,
since the subjects had been on the job for a while, it was decided 
that a follow-up study should be done to check the validity of the 
predictions to the criterion scales using a sample of new employees. 
This was done, and hence the additional subjects for the secondary 
analysis. 

At this point I became interested in exploring the reasons
underlying the differences in the validities of the various predictors.
The impetus came from observed similarities in the patterns of
intercorrelations among the predictors for the two groups. In terms
of the construct validity paradigm, these similarities were not
consistent with the belief that the predictors were tapping different
constructs in each of the racial subgroups.

One way of determining the construct validity of a test is to 
factor analyze the test along with a battery of tests of known 
meaning. The meaning of the new test is then derived from its 
relationships with the constructs measured by the known tests. It 
follows that if the factor structure of a test battery is the same 
for two populations, the tests are measuring the same constructs in 
both. 



The methodology used in this study followed this logic. First,
the 15 predictor scales were factor analyzed separately for the two
groups, and the factor structures compared. Then, again within each
group separately, each criterion scale was factored with the 15
predictors. This allowed a between-group comparison for each
criterion scale in terms of how each was related to the dimensions
being measured by the predictors.
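
The comparison of factor structures described above can be given a rough
numerical form. The sketch below uses Tucker's congruence coefficient, an
index this paper does not mention and may not have used; it is offered
only as one common way to quantify how similar two columns of factor
loadings are. The function name and all loading values are hypothetical.

```python
import math


def congruence(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    Values near +1 mean the two factors have proportional loadings;
    values near 0 mean the loading patterns are unrelated.
    """
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


# Hypothetical loadings of the same four tests on one factor,
# obtained from separate analyses of two groups (not the paper's data).
group_a = [0.70, 0.65, 0.10, 0.05]
group_b = [0.68, 0.71, 0.05, 0.12]

phi = congruence(group_a, group_b)
print(f"congruence = {phi:.3f}")
```

A coefficient close to 1 for each matched pair of factors would be in line
with the argument that the tests measure the same constructs in both groups.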

RESULTS 

For the total sample there was just one instance of a predictor-
criterion pair which was significant in the same direction for both
groups. There were 11 cases of differential validity; two of these
were instances of the validity coefficients being significant for
both groups and significantly different from each other (and in
this case, significantly positive for one group and significantly
negative for the other). There were also 13 cases of single group
validity, three significant just for Blacks and ten significant just
for Whites. The pattern of test-criterion relationships is obviously
very different for the two groups.
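
Differential validity means that two validity coefficients differ
significantly between groups. The paper does not say which test was used;
the sketch below assumes the usual Fisher z test for two independent
correlations. The group sizes are the ones reported here (70 and 104),
while the correlation values and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided z test for the difference between two independent correlations."""
    # Fisher's transformation makes the sampling distribution of r
    # approximately normal with variance 1 / (n - 3).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical validities of opposite sign in the two groups.
z, p = fisher_z_difference(-0.30, 70, 0.30, 104)
print(f"z = {z:.2f}, p = {p:.5f}")
```

Single group validity is a different case: one coefficient is significant
and the other is not, which does not by itself show that the two
coefficients differ from each other.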

The factor patterns for the 15 predictors are for the most part
similar for Blacks and Whites (see Table 1), the only striking
difference being the greater differentiation on the ability tests
for Whites. This not unexpected finding is the result of relatively
subtle differences in the intercorrelation matrix. The whole pattern
of intercorrelations among the five ability measures is higher for
Whites than for Blacks. However, the correlations between the
Adaptability Test and the Word scale and between the Checking scale
and the Copy scale are so much higher that they result in two factors
being defined instead of one.

This analysis of the factor structure of the predictors suggests
that they are in fact measuring the same or similar constructs within
each of the racially defined groups. If this were not so, the
probability of such similar factor structures would be low indeed.
There is no need to define or name the constructs being measured by
the tests at this time. Suffice it to say that whatever they are,
they are the same for both groups.

The criterion variables involved in the greatest number of
instances of differential validity were Attitude and Initiative.
When the Attitude criterion measurements are included in a factor
analysis with the predictors (see Table 2), we find that Attitude is
positively correlated with Factor II for Whites and negatively
correlated with Factor II for Blacks. The results from the analysis
of the Initiative scale with the predictors (see Table 3) are not so
clear-cut but do again demonstrate the problem. Initiative is not
related to any of the common factors for Whites, while it loads on
a previously undetected factor for Blacks.

The criterion variables of Knowledge of Procedures and Information
were most often involved in single group validity situations. The
Knowledge of Procedures scale (see Table 4) was related to Factor
III for Whites, but to a previously undetected factor for Blacks.
From the correlation matrix we find that for Blacks, the correlations
between this criterion scale and the Adaptability Test and the Word
scale, which define Factor III for Whites, are -.09 and -.06 respectively.



The Information scale (see Table 5) had the most disruptive effect
on the factor structure when added to the analysis. It is related
to the Word Meaning test scale for Blacks and to Factor VI, a
personality dimension, for Whites.

The Accuracy scale was the least well predicted of the
lot. Again, factoring it in conjunction with the predictor scales
shows that it is related to different predictor dimensions (see
Table 6). For Whites it is related to Factor IV, while for Blacks it
is related to a weakly defined personality dimension. Also, for
Blacks, the Accuracy scale correlated -.05 and .04 with the
Adaptability Test and the Word Meaning scale, which define Factor IV for
Whites.

Discussion and Conclusions 

These results indicate that the criterion scales, more than
the predictors, are contributing to the differential validity and
single group validity in this study. The factor structures for the
predictors are too similar to warrant the conclusion that the tests
are measuring different constructs in the different racial groups.
It is also worth noting that the communalities for each of the criterion
scales were, with two exceptions, almost identical to the squared
multiple R's obtained for the prediction equations. This means that
the majority of the valid variance is common variance in the current
analysis and that there are no minor factors, undetected here,
which could be different for the racial groups and therefore account
for the differential validity.
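
To make the communality comparison concrete: with orthogonal factors, a
variable's communality is the sum of its squared loadings on the common
factors. The short sketch below computes it for one criterion scale and
sets it beside a squared multiple R. All numbers are hypothetical; the
paper's tables list only the larger loadings, so its communalities cannot
be recomputed from them.

```python
def communality(loadings):
    """Sum of squared loadings of one variable on orthogonal common factors."""
    return sum(value * value for value in loadings)


# Hypothetical loadings of one criterion scale on five orthogonal factors.
h2 = communality([0.10, -0.33, 0.05, 0.20, 0.15])

# Hypothetical squared multiple R from that scale's prediction equation.
r_squared = 0.19

gap = r_squared - h2
print(f"communality = {h2:.4f}, R squared = {r_squared:.2f}, gap = {gap:.4f}")
```

A small gap means the variance the predictors explain in the criterion is
already captured by the common factors, which is the condition the
argument above relies on.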

Exactly what is happening is open to interpretation and will
require more study. There appears to be a general trend in the data
indicating that the performance ratings of Blacks are related to
personality dimensions, although not usually the major personality
dimensions defined by the predictor scales in this study. This is not
so evident for Whites, especially when we note that the criterion
scales of Knowledge of Procedures and Accuracy, the two real ability
scales, are related to ability measures in the predictor domain.

This study also raises an interesting point. By all objective
indices, the validation study which yielded these data resulted in
a selection procedure which complies with the EEOC guidelines. There
were no racial differences for mean scores on predictors or criterion
scales. Separate regression equations were developed for each racial
group. Use of the equations did not have an adverse impact; in
fact, new hires were running very close to 50-50 in an urban area
which is almost 50% Black. Yet we find racial differences in the
type of person who is considered an effective performer on the job.
Therefore, while such procedures may pass muster based on current
standards, the evidence indicates another example of equal but
different treatment.

One can only speculate about the long run effects. If such
procedures are common, they could have an effect on the evolution of
personality structure in two segments of the population by differentially
rewarding personality types. More pragmatically, such practices
may compound the problems in eliminating racial discrimination in
promotion decisions. Over the years the organization will amass
two different populations distinguishable on the dimension of race.
If the abilities and personality characteristics necessary for
success at the higher level position are different than at the
lower level, then it is very likely that the base rate for success
will be different for the two populations. The organization will
then have to promote persons not likely to succeed or have their
promotion procedures subjected to attack on the basis of adverse
impact. Therefore what looks like a solution to unfair discrimination
at one level of personnel procedures may lead to greater problems
at another level.

Finally, the evidence reported here suggests that it may be
less important to worry about discrimination based on test usage
and more important to concentrate on fairness in performance ratings.
Table 1. Predictor factor structure for Blacks and Whites.



Whites 



II III



44 
69 
88 
34 



60 
93 
33 
36 



76 
43 
74 



IV 
48 
60 
60 
52 
65 



Adaptability 
Word 
Checking 
Copy 
Spelling 
Gen. Activity 
Ascendancy 
Sociability 
Thoughtfulness 
Emotional Stab.
Objectivity 
Friendliness 
Personal Relat. 
Restraint 
Masculinity 

% Total Var. 11.6 11.2 9.8 11.7



68 
71 
64 
34 



II III 



53 
90 
53 
70 



10 .« 14 



33 
46 
50 

5.7 



IV 
80 
69 



VI 



55 
65 



-34 
40 

44 



53 

9.5 7.1 6.7 



Table 2. Factor structures with criterion scale: ATTITUDE 



Blacks 



I II III IV 
Attitude -33 
Adaptability 48 
Word 72 
Checking 52 
Copy 45 
Spelling 61 
Gen. Activity 43 
Ascendancy 73 
Sociability 84 
Thoughtfulness 36 
Emotional Stab. 67 
Objectivity 80 
Friendliness 30 76 

Personal Relat. 37 43 

Restraint 76 
Masculinity 



Whites 



I II III IV V VI 
27 

86 

64 -34 
66 

X 60 

-51 

65 
74 
65 

36 -44 

64 41 
84 

30 76 
72 



33 
Table 3. Factor structure with criterion scale: INITIATIVE



Blacks 



Whites



Initiative 

Adaptability 

Word 

Checking 

Copy 

Spelling 

Gen. Activity 40 
Ascendancy 77 
Sociability 82 
Thoughtfulness 35 
Emotional Stab.
Objectivity 
Friendliness 
Personal Relat. 
Restraint 
Masculinity 



II 



65 
84 
34 

37 



III IV 

47 
61 
67 
52 
63 



34 
43 
74 



V 

-62 



38 



30 



37 



69 
74 
63 
35 



II III 



53 
90 
55 

70 



30 
48 
49 



IV 

78 
70 



31 



V VI 
32 



51 
66 



-31 
35 

41 



54 



Table 4. Factor structure with criterion scale: KNOWLEDGE OF PROCEDURES



Blacks 



Whites 





I


II 


III 


IV 


V 


I 


II


Knowledge 








48 


66 






Adaptability 














Word








60 








Checking 








63 








Copy








53 








Spelling 








64 




64 




Gen. Activity


39 








-34 




Ascendancy


77 










77 




Sociability


82 










61 




Thoughtfulness 


36 


59 








38 


64 


Emotional Stab. 












Objectivity 




91 










92 


Friendliness 




31 


77 








43 


Personal Relat.




35 


43 








69 


Restraint 






74 










Masculinity 

















III 



-37 



59 
34 



IV 
39 
95 
48 



58 
61 



VI 



-47 



-53 
35 



44 



31 
Table 5. Factor structure with criterion scale: INFORMATION

Blacks Whites 

I II III IV V I II III IV V VI 

Information 34 42

Adaptability 36 73 37 

Word 40 57 70 

Checking 82 52 

Copy 53 67 

Spelling 52 37 

Gen. Activity 44 67 

Ascendancy 75 78 31 

Sociability 83 62 
Thoughtfulness 36 38 

Emotional Stab. 70 52 31 -38

Objectivity 82 86 

Friendliness 36 73 58 37 

Personal Relat. 39 42 70 

Restraint 76 52 

Masculinity -59 



Table 6. Factor structure with criterion scale: ACCURACY 



Blacks 



Whites 



Accuracy 

Adaptability 

Word 

Checking 

Copy 

Spelling 

Gen. Activity 44 
Ascendancy 73 
Sociability 86
Thoughtfulness 33 
Emotional Stab. 
Objectivity 
Friendliness 
Personal Relat. 
Restraint 
Masculinity 



II 



32 



57 
81 
31 
36 



III 



77 
43 
74 



IV 

45 
56 
71 
54 
63 



V 
63 



31 
-31 



64 
75 
61 
40 



II III



59 
92 
49 
70 



-38 



46 
51 



IV 
29 
90 

58 



V 



60 
62 



VI 



39 



50 



-47 



-40 



